Data creation and modification

You can add, update, replace, and delete indexed data in several ways. Manticore can also work with external storages such as databases and XML, CSV, and TSV documents. Insert and delete operations support transactions.

Also, for insert and replace queries, Manticore supports the Elasticsearch-like query format along with its own format. For details, see the corresponding examples in the Adding documents to a real-time table and REPLACE sections.

▪️ Adding documents to a table

Adding documents to a real-time table

If you are looking for information about adding documents to a plain table, see the section on adding data from external storages.

Adding documents in real time is supported only for real-time and percolate tables. The corresponding SQL command, HTTP endpoint, or client function inserts new rows (documents) into a table with the provided field values. The table does not have to exist before you add documents to it: if it does not exist, Manticore will attempt to create it automatically. For more information, see Auto schema.

You can insert one or more documents, providing values for all fields of the table or only some of them. Any field you omit is filled with its default value (0 for scalar types, an empty string for text types).

Expressions are currently not supported in INSERT and the values should be explicitly specified.

The ID field/value can be omitted as RT and PQ tables support auto-id functionality. You can also use 0 as the id value to force automatic ID generation. Rows with duplicate IDs will not be overwritten by INSERT. You can use REPLACE for that.

When using the HTTP JSON protocol, two different request formats are available: a common Manticore format and an Elasticsearch-like one. Both formats are demonstrated in the examples.

Also, if you use JSON and the Manticore request format, note that the doc node is mandatory and all the values should be provided inside it.
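
As a rough illustration of the Manticore request format, the sketch below builds an insert payload in which every field value is nested under the doc node. The top-level key names table and id and the helper name build_insert_payload are assumptions made for this example and are not stated in this section, so check them against the HTTP API reference for your version.

```python
import json

def build_insert_payload(table, doc, doc_id=None):
    """Build a request body in the Manticore JSON format (sketch).

    All field values go inside the mandatory "doc" node. The "table" and
    "id" key names are assumptions made for illustration.
    """
    if not isinstance(doc, dict):
        raise TypeError("doc must be a dict of field values")
    payload = {"table": table, "doc": doc}
    if doc_id is not None:
        payload["id"] = doc_id  # leave out to rely on auto ID
    return payload

body = json.dumps(build_insert_payload(
    "products", {"title": "Crossbody Bag with Tassel", "price": 19.85}))
print(body)
```

The payload is only built and serialized here; sending it is left to whichever HTTP client you use.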

General syntax:

INSERT INTO <table name> [(column, ...)]
VALUES (value, ...)
[, (...)]

For example:

INSERT INTO products(title,price) VALUES ('Crossbody Bag with Tassel', 19.85);
INSERT INTO products(title) VALUES ('Crossbody Bag with Tassel');
INSERT INTO products VALUES (0,'Yellow bag', 4.95);
Response
Query OK, 1 row affected (0.00 sec)
Query OK, 1 row affected (0.00 sec)
Query OK, 1 row affected (0.00 sec)

Auto schema

Manticore can automatically create a table when the table specified in an INSERT statement does not yet exist. This mechanism is enabled by default. To disable it, set auto_schema = 0 in the searchd section of your Manticore config file.

By default, all text values in the VALUES clause are considered to be of the text type, with the exception of values that represent valid email addresses, which are treated as the string type.

If you try to INSERT multiple rows with different, incompatible value types for the same field, auto table creation will be canceled and an error message will be returned. However, if the different value types are compatible, the resulting field type will be the one that accommodates all the values. Some automatic data type conversions that may occur include:

  • mva -> mva64
  • uint -> bigint -> float
  • string -> text
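
The widening rules listed above can be modeled with a small sketch. This is a hypothetical illustration of the idea, not Manticore's implementation: the names CHAINS, widen, and column_type are invented here, and only the three conversions listed above are covered.

```python
# Widening chains taken from the list above; a type can be promoted to
# any type that appears later in the same chain.
CHAINS = [
    ["mva", "mva64"],
    ["uint", "bigint", "float"],
    ["string", "text"],
]

def widen(a, b):
    """Return the type that accommodates both a and b, or raise ValueError."""
    if a == b:
        return a
    for chain in CHAINS:
        if a in chain and b in chain:
            return chain[max(chain.index(a), chain.index(b))]
    raise ValueError(f"incompatible types: {a} and {b}")

def column_type(observed):
    """Fold widen() over the types observed for one column across rows."""
    result = observed[0]
    for t in observed[1:]:
        result = widen(result, t)
    return result

print(column_type(["uint", "bigint", "uint"]))  # bigint
print(column_type(["uint", "float"]))           # float
```

In this model, a pair such as uint and text raises an error, which mirrors the case where auto table creation is canceled.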

MySQL [(none)]> drop table if exists t; insert into t(i,f,t,s,j,b,m,mb) values(123,1.2,'text here','test@mail.com','{"a": 123}',1099511627776,(1,2),(1099511627776,1099511627777)); desc t; select * from t;
Response
--------------
drop table if exists t
--------------

Query OK, 0 rows affected (0.42 sec)

--------------
insert into t(i,f,t,s,j,b,m,mb) values(123,1.2,'text here','test@mail.com','{"a": 123}',1099511627776,(1,2),(1099511627776,1099511627777))
--------------

Query OK, 1 row affected (0.00 sec)

--------------
desc t
--------------

+-------+--------+----------------+
| Field | Type   | Properties     |
+-------+--------+----------------+
| id    | bigint |                |
| t     | text   | indexed stored |
| s     | string |                |
| j     | json   |                |
| i     | uint   |                |
| b     | bigint |                |
| f     | float  |                |
| m     | mva    |                |
| mb    | mva64  |                |
+-------+--------+----------------+
9 rows in set (0.00 sec)

--------------
select * from t
--------------

+---------------------+------+---------------+----------+------+-----------------------------+-----------+---------------+------------+
| id                  | i    | b             | f        | m    | mb                          | t         | s             | j          |
+---------------------+------+---------------+----------+------+-----------------------------+-----------+---------------+------------+
| 5045949922868723723 |  123 | 1099511627776 | 1.200000 | 1,2  | 1099511627776,1099511627777 | text here | test@mail.com | {"a": 123} |
+---------------------+------+---------------+----------+------+-----------------------------+-----------+---------------+------------+
1 row in set (0.00 sec)

Auto ID

Manticore can generate the ID column automatically for documents inserted or replaced into a real-time or percolate table. The generator produces a unique document ID under certain conditions and should not be treated as an auto-incremented ID.

The generated ID is guaranteed to be unique under the following conditions:

  • the server_id value of the current server is in the range 0 to 127 and is unique among the nodes in the cluster, or it uses the default value generated from the MAC address as a seed
  • the system time does not change for the Manticore node between server restarts
  • auto ID is generated fewer than 16 million times per second between search server restarts

The auto ID generator creates a 64-bit integer for a document ID using the following layout:

  • bits 0 to 23 are a counter that is incremented on every call to the auto ID generator
  • bits 24 to 55 are the Unix timestamp of the server start
  • bits 56 to 63 are the server_id

This layout ensures that the generated ID is unique among all nodes in the cluster and that data inserted into different cluster nodes does not create collisions between the nodes.
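
To make the bit layout concrete, here is a minimal sketch that packs and unpacks an ID according to the three bit ranges described above. It models the description only: the function names are hypothetical, and IDs produced by a real server are not guaranteed to decode this way.

```python
COUNTER_BITS = 24      # bits 0..23: per-call counter
TIMESTAMP_BITS = 32    # bits 24..55: Unix timestamp of server start
SERVER_ID_SHIFT = 56   # bits 56..63: server_id

def pack_auto_id(server_id, start_ts, counter):
    """Combine the three parts into one 64-bit integer (illustrative only)."""
    if not 0 <= server_id <= 127:
        raise ValueError("server_id must be in the range 0..127")
    if not 0 <= start_ts < (1 << TIMESTAMP_BITS):
        raise ValueError("start_ts does not fit in 32 bits")
    if not 0 <= counter < (1 << COUNTER_BITS):
        raise ValueError("counter does not fit in 24 bits")
    return (server_id << SERVER_ID_SHIFT) | (start_ts << COUNTER_BITS) | counter

def unpack_auto_id(doc_id):
    """Split an ID built by pack_auto_id back into its parts."""
    counter = doc_id & ((1 << COUNTER_BITS) - 1)
    start_ts = (doc_id >> COUNTER_BITS) & ((1 << TIMESTAMP_BITS) - 1)
    server_id = doc_id >> SERVER_ID_SHIFT
    return server_id, start_ts, counter

first = pack_auto_id(1, 1_700_000_000, 0)
second = pack_auto_id(1, 1_700_000_000, 1)
print(second - first)                                       # 1
print(unpack_auto_id(pack_auto_id(1, 1_700_000_000, 5)))    # (1, 1700000000, 5)
```

A 24-bit counter holds 16,777,216 distinct values, which is consistent with the limit of roughly 16 million generated IDs mentioned in the uniqueness conditions.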

That is why the first ID produced by the generator is not 1 but a much larger number. Also, a stream of documents inserted into one table may not have sequential ID values if inserts into other tables happen in between, because the server has a single ID generator shared by all its tables.

INSERT INTO products(title,price) VALUES ('Crossbody Bag with Tassel', 19.85);
INSERT INTO products VALUES (0,'Yellow bag', 4.95);
select * from products;
Response
+---------------------+-----------+---------------------------+
| id                  | price     | title                     |
+---------------------+-----------+---------------------------+
| 1657860156022587404 | 19.850000 | Crossbody Bag with Tassel |
| 1657860156022587405 |  4.950000 | Yellow bag                |
+---------------------+-----------+---------------------------+

Bulk adding documents

You can insert not just a single document into a real-time table, but as many as you want. It is fine to insert into a real-time table in batches of tens of thousands of documents. Keep the following in mind:

  • the larger the batch, the higher the latency of each insert operation
  • the larger the batch, the higher the indexing speed you can expect
  • each batch insert operation is a single transaction with an atomicity guarantee: either all the new documents appear in the table at once or, in case of a failure, none of them are added
  • you might want to increase the max_packet_size value to allow bigger batches
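
The trade-off between batch size and latency is usually handled on the client side by splitting a document stream into fixed-size batches. The following is a minimal sketch of that idea. The send_batch callable is a hypothetical stand-in for whatever performs one bulk insert in your client, and the batch size is something you would tune yourself.

```python
from itertools import islice

def batches(docs, size):
    """Yield lists of at most `size` documents from any iterable."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def insert_all(docs, size, send_batch):
    """Send docs batch by batch and return how many documents were sent.

    send_batch is a hypothetical callable that performs one bulk insert;
    each call corresponds to one transaction, so a failing batch adds
    none of its documents.
    """
    sent = 0
    for chunk in batches(docs, size):
        send_batch(chunk)
        sent += len(chunk)
    return sent

collected = []
total = insert_all(range(25), 10, collected.append)
print(total, [len(b) for b in collected])  # 25 [10, 10, 5]
```

If send_batch raises, the loop stops at the failing batch, so earlier batches stay in the table while the failing one contributes nothing.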

For a bulk insert, provide more documents in parentheses after VALUES. The syntax is:

INSERT INTO <table name>[(column1, column2, ...)] VALUES (value1[, value2, ...])[, (...)]

The optional column name list lets you explicitly specify values for only some of the columns in the table. All the other columns will be filled with their default values (0 for scalar types, an empty string for string types).

For example:

INSERT INTO products(title,price) VALUES ('Crossbody Bag with Tassel', 19.85), ('microfiber sheet set', 19.99), ('Pet Hair Remover Glove', 7.99);
Response
Query OK, 3 rows affected (0.01 sec)

Expressions are not currently supported in INSERT and values should be explicitly specified.

Inserting multi-value attribute (MVA) values

Multi-value attributes (MVA) are inserted as arrays of numbers.

INSERT INTO products(title, sizes) VALUES('shoes', (40,41,42,43));

Inserting JSON

A JSON value can be inserted as an escaped string (via SQL, HTTP, PHP) or as a JSON object (via HTTP).

INSERT INTO products VALUES (1, 'shoes', '{"size": 41, "color": "red"}');
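
When the JSON value comes from application data, the escaped string for the SQL form can be produced programmatically. The sketch below assumes MySQL-style escaping, where backslashes and single quotes are prefixed with a backslash. That escaping rule and the helper name json_sql_literal are assumptions of this example, so verify them against the escaping rules for your Manticore version and prefer your client's own parameter handling where available.

```python
import json

def json_sql_literal(value):
    """Serialize value to JSON and wrap it as a single-quoted SQL literal.

    Assumes backslash escaping for backslashes and single quotes.
    """
    text = json.dumps(value)
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"

literal = json_sql_literal({"size": 41, "color": "red"})
print("INSERT INTO products VALUES (1, 'shoes', " + literal + ");")
```

For the dictionary used here, the generated statement matches the INSERT shown above.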